1. Getting started 😄

Today we will learn about plotly, an R package that allows you to build interactive graphs. To shake things up a bit, we’ll be working with a dataset that contains the information of all 801 Pokémon in the world!

This dataset, from package Rokemon, is based on the hit video game franchise. Every Pokémon species has its own characteristics when it comes to height, weight, attack power, defense power, speed, hit points (stamina), among others. Gotta analyze’em all!

#Please install the Rokemon package

#Make sure you have the "devtools" package, if not, install it first.

###install.packages("devtools")
###library(devtools)

devtools::install_github("schochastics/Rokemon")

#Load the packages

library(Rokemon)
library(plotly)
library(tidyverse)
library(dplyr)
library(readxl)

#Dataset

pokemon <- pokemon

####Gotta analyze’em all!

2. Plotly vs Ggplot 🔥

  • Ggplot provides meaningful, appealing, but static graphs. This limits the users’ possibilities to analyse the data.

  • Plotly allows users to interact with graphs on a wide variety of forms: zoom in/out of the plot, hover over a point, filter categories, among others.

  • Interactive graphs are useful for having a deeper understanding of the patterns of data.

3. Examples 📉

3A. Using ggplot

Let’s take a closer look of our dataset.

pok3a <- ggplot(pokemon, aes(x=attack, y=speed)) +
  geom_point(shape=1, alpha=0.5) +   
  ggtitle("Fig. 1: Attack vs Speed ", subtitle = "Built with Ggplot") +
  labs(y="Attack", 
       x = "Speed", 
  caption = "Source: Pokemon")+
  theme_minimal()

pok3a

This graphic looks cool! We can clearly see a positive correlation between attack and speed. But what would happen if we add the Pokémons’ name to the graph? Imagine that we would like to know to which Pokémon belongs each dot in order to build the fastest and strongest Pokémon team. It would look like this:

3B. Adding names in Ggplot

pok3b <- ggplot(pokemon, aes(x=attack, y=speed, label= name)) +
  geom_point(shape=1, alpha=0.5) +
  geom_text(size=3, hjust=1, vjust=1) +
  ggtitle("Fig. 1B: Attack vs Speed ", subtitle = "Built with Ggplot") +
  labs(y="Attack", 
       x = "Speed", 
  caption = "Source: Pokemon")+
  theme_minimal()

pok3b

Graphic 3B is not clear, you can not identify the Pokémon. So… here comes plotly.

3C. Using Plotly

pok3c <- pokemon%>%
  plot_ly(x = ~attack, y = ~speed, type = 'scatter', mode = 'markers',text=~name,
          marker=list(color="blue", size=10))%>%
  layout(title= "Fig. 1: Attack vs Speed",
         xaxis = list(title = list(text = 'Speed')),
         yaxis = list(title = list(text = 'Attack')))

pok3c

Plotly allows the user to hover over the individual observations and see hover text corresponding to the point on the plot. Now, you can see each Pokémon’s name without clutter.

Much better right? Let’s keep pushing!

3D. Adding more elements to plotly

pok3d_df <- pokemon
pok3d_df$generation <- as.character(pok3d_df$generation)

# pokemon_gen13$is_legendary[pokemon_gen13$is_legendary == "1"] <- "Legendary"
# pokemon_gen13$is_legendary[pokemon_gen13$is_legendary == "0"] <- "Not legendary"

pok3d <- pok3d_df %>%
  plot_ly(x = ~attack, y = ~speed, color =~generation,
          type = 'scatter', mode = 'markers', text= ~name,
          colors = c("#D64E12", "#F9A52C", "#EFDF48", "#8BD346", "#60DBE8", "#16A4D8", "#9B5FE0")) %>%
  layout(title= "Fig. 1: Attack vs Speed",
         xaxis = list(title = list(text = 'Speed')),
         yaxis = list(title = list(text = 'Attack')))

pok3d

This graph allows us to map the Pokémon by the game they first appeared in (generation).

Plotly lets you filter observations by clicking on the legend. Try singling out only Pokemon from the first generation by double-clicking on “1” on the right-hand side.

4. Other type of cool plots 🦸

Besides scatterplots, you can also create other type of graphics in Plotly such as bubble charts, histograms, box plots, among others. Let’s see some cool examples!

Remember, you can be anything you want to be.

4A. Bubble charts

##In this example we want to see if there is any correlation between capture rate and total stats average of Pokémon types using a bubble chart.

pokemon_count <- pokemon %>%
  count(type1)

pokemon_sub <- pokemon %>%
  group_by(type1) %>%
  summarise(ave_capture = mean(capture_rate, na.rm=T), 
            ave_power = mean(base_total, na.rm = T))
  
pokemon_sub <- merge(pokemon_sub, pokemon_count, by = "type1")

type_colors <- c('#A8A77A', '#EE8130', '#6390F0', '#F7D02C', '#7AC74C','#96D9D6', '#C22E28', '#A33EA1', '#E2BF65', '#A98FF3', '#F95587', '#A6B91A', '#B6A136', '#735797', '#6F35FC', '#705746', '#B7B7CE', '#D685AD')

pok4a <- plot_ly(pokemon_sub, x = ~ave_power, y = ~ave_capture, 
                 hovertext = ~paste('</br> Name: ', type1,
                                    '</br> Number of Pkmn: ', n), 
                 type = 'scatter', mode = 'markers',
                 size = ~n, color = ~type1, colors = type_colors,
                 marker = list(opacity = 0.5, sizemode = 'diameter'))

pok4a <- pok4a %>% layout(title = 'Capture rate vs Base stats total per Pokémon Type',
         xaxis = list(title = list(text = 'Base stats total')),
         yaxis = list(title = list(text = 'Avergage Capture Rate')),
         showlegend = FALSE)

pok4a

In this bubble chart, you can hover over the circles to see what Pokémon type it corresponds to and how many Pokémon belong to that type.

(For Pokémon nerds out there, this plot only takes into account primary typing! So sorry, Flying types.)

4B. Histograms

pok4b <- pokemon %>%
  plot_ly(alpha = 0.6) %>%
  add_histogram(x = ~defense,
                name = "Defense") %>% 
  add_histogram(x = ~attack,
                name = "Attack") %>% 
  layout(barmode = "overlay",
         title = "Histogram",
         xaxis = list(title = "Point Average",
                      zeroline = FALSE),
         yaxis = list(title = "Frequency",
                      zeroline = FALSE))

pok4b

Wow! Pokemons’ defense and attack points also follow a normal distribution!

4C. Box plots

pok4c <- plot_ly(data = pokemon,
  y = ~base_total,
  x = ~generation,
  type = "box",
  showlegend = FALSE)

pok4c

This boxplot showcases on average how strong (base_total) the Pokémon in each generation of games are. Hovering over each bin with your mouse will reveal summary statistics of each group.

5. Building your first Plotly graph 😎

5A. Let’s get started!

This tutorial level will walk you through on how to catch your first Pokémon build your first plotly plot. There are two ways to do this:

  • with the plot_ly() function
  • with the ggplot_ly() function, which translates a ggplot into plotly

We’re going to have a more in-depth look at the first option.

Let’s say that I want to have a look at the Pokémon from the first Pokémon game, released in 1996. My theory is that a Pokémon’s hit points total is positively correlated with its defense stat, as a more protected, defensive Pokémon will last longer in battle.

I can map this out easily with plotly via a scatterplot. To choose what plot I want, I use the type argument within plot_ly(). For a full list of attributes and arguments that can be passed along in plotly, see schema().

Notice that I call the variables with a ~. This is important to plotly’s syntax.

# cleaning data
pokemon_gen13 <- filter(pokemon, generation == "1")

# constructing our plotly graph
pok5a <- plot_ly(pokemon_gen13, x = ~defense, y = ~hp, type = 'scatter')

pok5a

Visually, there does seem to be a positive correlation with these two statistics.

While it’s a nice scatterplot, this hardly takes advantage of plotly’s interactive capabilities. Let’s make our plot better by introducing detail and plotly’s signature feature, hover text.

5B. Leveling up your plots

# cleaning data
pokemon_gen13 <- filter(pokemon, generation == "1")

pokemon_gen13$is_legendary[pokemon_gen13$is_legendary == "1"] <- "Legendary"
pokemon_gen13$is_legendary[pokemon_gen13$is_legendary == "0"] <- "Not legendary"

# constructing our plotly graph
pok5b <- plot_ly(pokemon_gen13, x = ~defense, y = ~hp,
               type = 'scatter',
               color = ~is_legendary, 
               colors = c('#BF382A', '#0C4B8E'),
               opacity = 0.5,
               hovertext = ~paste('</br> Name: ', name,
                             '</br> Species: ', classfication,
                             '</br> Type: ', type1, '/', type2)) %>% 
layout(title = "Hit points by Defense points",
  xaxis = list(title = list(text = 'Defense statistic')),
  yaxis = list(title = list(text = 'Hit points')))

pok5b

I used the following functions to add more detail:

  • color() allows you to create a legend for your plot. Here, I separated the observations based on if the Pokémon is legenday (read: one of a kind) or not.
  • colors() assigns colors to your legend. Supports hexcodes, RBG values…
  • opacity() modifies observation markers, useful if you have many overlapping ones.
  • hovertext() is the big deal here. This option is very flexible, so please see plotly’s documentation for the full scope of what it’s capable of. Here, I made it to display basic information about each critter.
  • layout() enables me to adjust the axis labels, but also supports many other arguments that modify your plot’s final appearance. See here for a full list.

5C. What? Your scatterplot is evolving!

But plotly doesn’t stop there. As previously shown, the package is capable of rendering many different plots – even 3D ones, taking full advantage of the digital space these plots are intended to be displayed in.

I’ll show you this by adding a z-axis on my previous plot. While I can define the trace type manually, plotly will also automatically assume a 3D scatterplot from my parameters.

# cleaning data
pokemon_gen13 <- filter(pokemon, generation == "1")

pokemon_gen13$is_legendary[pokemon_gen13$is_legendary == "1"] <- "Legendary"
pokemon_gen13$is_legendary[pokemon_gen13$is_legendary == "0"] <- "Not legendary"


# constructing our plotly graph
pok5c <- plot_ly(pokemon_gen13, x = ~defense, y = ~hp, z = ~height_m,
               color = ~is_legendary, 
               colors = c('#BF382A', '#0C4B8E'),
               opacity = 0.5,
               hovertext = ~paste('</br> Name: ', name,
                             '</br> Species: ', classfication,
                             '</br> Type: ', type1, '/', type2)) %>% 
layout(title = "Hit points by Defense points by Height (in meters)",
  xaxis = list(title = list(text = 'Defense statistic')),
  yaxis = list(title = list(text = 'Hit points')))

pok5c

6. Some extras 🎓

6A. Exporting plotly graphs as static images

While plotly is best used on a digital platform due to its interactive features, there may be a time where you want to use your beautiful, carefully drawn plot in a non-dynamic medium – like, say a paper or a print out.

The orca() function allows you to do this, although it requires you to download Orca, an open source command line tool that interacts with plotly. You can transform your graphs into .jpegs, .pngs, .pdfs…

The installation process is a bit more complicated than your usual R package (see here for full instructions), so we won’t be showing orca() in action. However, this is what code to render your plot into an image would look like:

library(orca)
orca(pok5b, "pok5b.png")

6B. Transforming ggplot graphs into plotly

Yes, it’s possible! The function ggplotly() allows you to do this.

Remember this graph from before?

pok3a

Applying our make-over with ggplot_ly():

pok3a %>% 
  style(hovertext = ~name) %>% 
  ggplotly()

To change the looks of this graph, you would have to do some modifications on the ggplot side of things, although some alterations are possible with style(). It may be better to build the graph natively in plotly() depending on your requirements.

7. Conclusion 📝

The plotly package is a powerful data visualization tool which can bring your plots and graphs to life by making them more dynamic and interactive. Like ggplot, it supports a wide variety of plots, from scatter to histograms to even heatmaps. plotly is best used in an digital setting, such as in your browser, in order take full advantage of its functionality.

Plotting is made possible with the plot_ly() function. The simple function plot_ly(dataset, x = ~xaxis, y = ~yaxis, type = 'plot') will get you started with basic functionality. The hovertext() function allows you to modify information that is shown when hovering over a point. Many other customizations are possible (type ?plot_ly() to see more), and the graph visualization can be adjusted via layout().

With these tips, your plots will be the very best, like no plot ever was!

8. Final challenges 🆗

8A.

Here are three exercises to challenge your plotly skills. Using the pokemon dataset from the package Rokemon, can you plot out the relationship of height and weight for all Fire-type Pokemon? Assign legend colors according to generation, and make sure the hover text includes the pokemon’s name and their classification.

Possible solution below - no cheating! ;)

pokemon_hw <- filter(pokemon, type1 == "fire")

hw_plot <- plot_ly(pokemon_hw, x = ~height_m, y = ~weight_kg,
               type = 'scatter',
               color = ~generation, 
               opacity = 0.5,
               hovertext = ~paste('</br> Name: ', name,
                             '</br> Species: ', classfication)) %>% 
layout(title = "Relationship between height and weight of Fire-types",
  xaxis = list(title = list(text = 'Height (meters)')),
  yaxis = list(title = list(text = 'Weight (kilograms)')))

hw_plot

8B.

We wanted to see if there was a correlation between weight and attack in ground and water type pokémons, sadly… we couldn’t find any :/ The worst part is that bug pokemons made their thing and ruined our code! Now we can’t show you that there is no correlation. But here you come, pokémon masteR, please find the BUGS in the following code, fix it and bring back the graph! Do you accept the challenge?

challenge2 <- rokemon %>%
  filter(type = c("ground","water")) %>%
  plot_ly(x = ~attack, y = ~weigth_kg, splt = ~type, 
        type == 'scatter') %>% 
  layout(title = 'Weight  vs. Attack', 
         xaxs = list(title = 'Attack'),
         yaxis = list(title = 'Weight (kilograms)'),          
         legend = list(title = list(text = 'Type')),
         plot_bgcolor = 'white')

challenge2

8C.

Empirically speaking, we think that the number of water type pokémons has decreased among generations. Can you plot a graphic to check if there is such pattern? Create a bar plot filtering pokémons by type, include water type -of course- and three more. Add colors according to the type. Share your results!

challenge3 <- pokemon %>%
  filter(type1 == c("fire", "ground", "water", "grass")) %>%
  count(generation, type1) %>% 
  plot_ly(x = ~generation, y = ~n, type = 'bar', color = ~type1)

challenge3